We study the distribution of fluctuations over a time scale $\Delta t$
(i.e., the returns) of the S&P 500 index by analyzing three distinct
databases. Database (i) contains approximately 1 million records
sampled at 1 min intervals for the 13-year period 1984-1996, database
(ii) contains 8686 daily records for the 35-year period 1962-1996, and
database (iii) contains 852 monthly records for the 71-year period
1926-1996. We compute the probability distributions of returns over a
time scale $\Delta t$, where $\Delta t$ varies approximately over a
factor of $10^4$, from 1 min up to more than 1 month. We find that the
distributions for $\Delta t < 4$ days (1560 mins) are consistent with a
power-law asymptotic behavior, characterized by an exponent
$\alpha \approx 3$, well outside the stable Levy regime
$0 < \alpha < 2$. To test the robustness of the S&P result, we perform
a parallel analysis on two other financial market indices. Database
(iv) contains 3560 daily records of the NIKKEI index for the 14-year
period 1984-97, and database (v) contains 4649 daily records of the
Hang-Seng index for the 18-year period 1980-97. We find estimates of
$\alpha$ consistent with those describing the distribution of S&P 500
daily returns. One possible reason for the scaling of these
distributions is the long persistence of the autocorrelation function
of the volatility. For time scales longer than
$(\Delta t)_\times \approx 4$ days, our results are consistent with
slow convergence to Gaussian behavior.



I. INTRODUCTION AND BACKGROUND 



The analysis of financial data by methods developed for physical
systems has a long tradition and has recently attracted the interest
of physicists. Among the reasons for this interest is the scientific
challenge of understanding the dynamics of a strongly fluctuating
complex system with a large number of interacting elements. In
addition, it is possible that the experience gained by studying
complex physical systems might yield new results in economics.

Financial markets are complex dynamical systems with 
many interacting elements that can be grouped into two 
categories: (i) the traders — such as individual investors, 
mutual funds, brokerage firms, and banks — and (ii) the 
assets — such as bonds, stocks, futures, and options. In- 
teractions between these elements lead to transactions 
mediated by the stock exchange. The details of each 
transaction are recorded for later analysis. The dynam- 
ics of a financial market are difficult to understand not 
only because of the complexity of its internal elements 
but also because of the many intractable external fac- 
tors acting on it, which may even differ from market to 
market. Remarkably, the statistical properties of certain
observables appear to be similar for quite different markets,
consistent with the possibility that there may exist "universal"
results.

The most challenging difficulty in the study of a fi- 
nancial market is that the nature of the interactions be- 
tween the different elements comprising the system is un- 
known, as is the way in which external factors affect it. 
Therefore, as a starting point, one may resort to empirical studies to
help uncover the regularities or "empirical laws" that may govern
financial markets.

The interactions between the different elements comprising financial
markets generate many observables such as the transaction price, the
share volume traded, the trading frequency, and the values of market
indices [Fig. 1]. A number of studies investigated the time series of
returns on varying time scales $\Delta t$ in order to probe the nature
of the stochastic process underlying it. For a time series $S(t)$ of
prices or market index values, the return $G(t) \equiv G_{\Delta t}(t)$
over a time scale $\Delta t$ is defined as the forward change in the
logarithm of $S(t)$,

$$G_{\Delta t}(t) \equiv \ln S(t + \Delta t) - \ln S(t) . \tag{1}$$

For small changes in $S(t)$, the return $G_{\Delta t}(t)$ is
approximately the forward relative change,

$$G_{\Delta t}(t) \approx \frac{S(t + \Delta t) - S(t)}{S(t)} . \tag{2}$$
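As a rough illustration of Eqs. (1) and (2), the sketch below computes both quantities for a short list of index values. The prices are made-up numbers, and the function names `log_returns` and `relative_returns` are illustrative choices rather than anything used in the original analysis.

```python
import math

def log_returns(prices, dt=1):
    """Forward log returns G_dt(t) = ln S(t + dt) - ln S(t), as in Eq. (1)."""
    return [math.log(prices[t + dt]) - math.log(prices[t])
            for t in range(len(prices) - dt)]

def relative_returns(prices, dt=1):
    """Forward relative change [S(t + dt) - S(t)] / S(t), as in Eq. (2)."""
    return [(prices[t + dt] - prices[t]) / prices[t]
            for t in range(len(prices) - dt)]

# Hypothetical index values, for illustration only (not from the databases).
prices = [100.0, 100.5, 100.2, 101.0, 100.8]

G = log_returns(prices)
R = relative_returns(prices)

# Largest difference between the two definitions for these small changes.
max_gap = max(abs(g - r) for g, r in zip(G, R))

# Log returns add up: the dt = 4 return is the sum of the four dt = 1 returns.
G_long = log_returns(prices, dt=4)[0]
```

One practical reason to prefer the logarithmic form is this additivity: returns on longer time scales follow from sums of returns on shorter ones.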



In 1900, Bachelier proposed the first model for the stochastic process
of returns: an uncorrelated random walk with independent, identically
Gaussian distributed (i.i.d.) random variables. This model is natural
if one considers the return over a time scale $\Delta t$ to be the
result of many independent "shocks", which then lead by the central
limit theorem to a Gaussian distribution of returns. However,
empirical studies show that the distribution of returns has pronounced
tails in striking contrast to that of a Gaussian. To illustrate this
fact, we show in Fig. 2 the 10 min returns of the S&P 500 market index
for 1986-1987 and contrast it with a sequence of i.i.d. Gaussian
random variables. Both are normalized to have unit variance. Clearly,
large events are very frequent in the data, a fact largely
underestimated by a Gaussian process. Despite this empirical fact, the
Gaussian assumption for the distribution of returns is widely used in
theoretical finance because of the simplifications it provides in
analytical calculation; indeed, it is one of the assumptions used in
the classic Black-Scholes option pricing formula.

In his pioneering analysis of cotton prices, Mandelbrot observed that
in addition to being non-Gaussian, the process of returns shows
another interesting property: "time scaling", that is, the
distributions of returns for various choices of $\Delta t$, ranging
from 1 day up to 1 month, have similar functional forms. Motivated by
(i) pronounced tails, and (ii) a stable functional form for different
time scales, Mandelbrot proposed that the distribution of returns is
consistent with a Levy stable distribution, that is, the returns can
be modeled as a Levy stable process. Levy stable distributions arise
from the generalization of the central limit theorem to random
variables which do not have a finite second moment [see Appendix A].

Conclusive results on the distribution of returns are difficult to
obtain, and require a large amount of data to study the rare events
that give rise to the tails. More recently, the availability of high
frequency data on financial market indices, and the advent of improved
computing capabilities, has facilitated the probing of the asymptotic
behavior of the distribution. For these reasons, recent empirical
studies of the S&P 500 index analyze typically $10^6$-$10^7$ data
points, in contrast to approximately 2000 data points analyzed in the
classic work of Mandelbrot. Reference [10] reports that the central
part of the distribution of S&P 500 returns appears to be well fit by
a Levy distribution, but the asymptotic behavior of the distribution
of returns shows faster decay than predicted by a Levy distribution.
Hence, Ref. [10] proposed a truncated Levy distribution (a Levy
distribution in the central part followed by an approximately
exponential truncation) as a model for the distribution of returns.
The exponential truncation ensures the existence of a finite second
moment, and hence the truncated Levy distribution is not a stable
distribution [33,34]. The truncated Levy process with i.i.d. random
variables has slow convergence to Gaussian behavior due to the Levy
distribution in the center, which could explain the observed time
scaling for a considerable range of time scales.

In addition to the probability distribution, a complementary aspect
for the characterization of any stochastic process is the
quantification of correlations. Studies of the autocorrelation
function of returns show exponential decay with characteristic decay
times $\tau_{ch}$ of only 4 min. As is clear from Fig. 3(a), for time
scales beyond 20 min the correlation function is at the level of
noise, in agreement with the efficient market hypothesis, which states
that it is not possible to predict future stock prices from their
previous values. If price correlations were not short-range, one could
devise a way to make money from the market indefinitely.

It is important to note that lack of linear correlation does not imply
an i.i.d. process for the returns, since there may exist higher-order
correlations [Fig. 3(b)]. Indeed, the amplitude of the returns,
referred to in economics as the volatility, shows long-range time
correlations that persist up to several months, and are characterized
by an asymptotic power-law decay.
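The distinction between uncorrelated returns and correlated volatility can be sketched with a toy series. The code below is only an illustration under stated assumptions: the series is synthetic (random signs times an amplitude that switches every 50 steps), not market data, and `autocorrelation` is a hypothetical helper that implements the usual normalized estimator.

```python
import random

def autocorrelation(series, lag):
    """Normalized autocorrelation of `series` at an integer lag >= 1."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series) / n
    cov = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(n - lag)) / (n - lag)
    return cov / var

# Toy series (an assumption, not market data): random signs times an
# amplitude that switches between 1 and 3 every 50 steps.
rng = random.Random(0)
n = 5000
amplitude = [1.0 if (t // 50) % 2 == 0 else 3.0 for t in range(n)]
returns = [rng.choice((-1.0, 1.0)) * a for a in amplitude]

# The signed returns are linearly uncorrelated ...
c_returns = autocorrelation(returns, 1)
# ... while their absolute values (the amplitude) are strongly correlated.
c_volatility = autocorrelation([abs(g) for g in returns], 1)
```

In this toy case the amplitude correlations are built in by hand and are short-lived; the empirical finding in the text is the much stronger statement that such correlations decay as a power law over months.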



II. MOTIVATION 



A recent preliminary study reported that the distributions of 5 min
returns for 1000 individual stocks and the S&P 500 index decay as a
power law with an exponent well outside the stable Levy regime.
Consistent results were found by studies both on stock markets [24]
and on foreign exchange markets. These results raise two important
questions:

First, the distribution of returns has a finite second moment; thus,
we would expect it to converge to a Gaussian because of the central
limit theorem. On the other hand, preliminary studies suggest that the
distributions of returns retain their power-law functional form for
long time scales. So, which of these two scenarios is correct? We find
that the distributions of returns retain their functional form for
time scales up to approximately 4 days, after which we find results
consistent with a slow convergence to Gaussian behavior.

Second, power-law distributions are not stable distributions, but the
distribution of returns retains its functional form for a range of
time scales. It is then natural to ask how this scaling behavior can
possibly arise. One possible explanation is the recently proposed
exponentially-truncated Levy distribution [10,33,34]. However, the
truncated Levy process is constructed out of i.i.d. random variables
and hence is not consistent with the empirically observed long
persistence in the autocorrelation function of the volatility of
returns. Moreover, our data support the possibility that the
asymptotic nature of the distribution is a power law with an exponent
outside the Levy regime. Also, we will argue that the scaling behavior
observed in the distribution of returns may be connected to the slow
decay of the volatility correlations.

The organization of the paper is as follows. Section III describes the
data analyzed. Sections IV and V study the distribution of returns of
the S&P 500 index on time scales $\Delta t < 1$ day and
$\Delta t > 1$ day, respectively. Section VI examines the dependence
of the average volatility on the time scale, Section VII discusses how
time correlations in volatility are related to the time scaling of the
distributions, and Section VIII presents concluding remarks.



III. THE DATA ANALYZED

First, we analyze the S&P 500 index, which comprises 
500 companies chosen for market size, liquidity, and in- 
dustry group representation in the US. The S&P 500 is a 
market-value weighted index (stock price times number 
of shares outstanding), with each stock's weight propor- 
tional to its market value. The S&P 500 index is one of 
the most widely used benchmarks of U.S. equity performance. In our
study, we first analyze database (i), which contains "high-frequency"
data that cover the 13-year period 1984-1996, with a recording
interval of less than 1 min. The total number of records in this
database exceeds $4.5 \times 10^6$. To investigate longer time scales,
we study two other databases. Database (ii) contains daily
records of the S&P 500 index for the 35-year period 1962- 
1996, and database (iii) contains monthly records for the 
71-year period 1926-1996. 

In order to test if our results are limited to the S&P 500 
index, we perform a parallel analysis on two other market 
indices. Database (iv) contains 3560 daily records of the 
NIKKEI index of the Tokyo stock exchange for the 14- 
year period 1984-1997, and database (v) contains 4649 
daily records of the Hang-Seng index of the Hong Kong 
stock exchange for the 18-year period 1980-1997. 



IV. THE DISTRIBUTION OF RETURNS FOR $\Delta t < 1$ DAY

A. The distribution of returns for $\Delta t = 1$ min

First, we analyze the values of the S&P 500 index from the
high-frequency data for the 13-year period 1984-1996, which extends
the database studied in Ref. [10] by an additional 7 years. The data
are typically recorded at 15 second intervals. We first sample the
data at 1 min intervals and generate a time series $S(t)$ with
approximately 1.2 million data points. From the time series $S(t)$, we
compute the return $G \equiv G_{\Delta t}(t)$, which is the relative
change in the index, defined in Eq. (1).

In order to compare the behavior of the distribution for different
time scales $\Delta t$, we define a normalized return
$g \equiv g_{\Delta t}(t)$,

$$g \equiv \frac{G - \langle G \rangle_T}{v} . \tag{3}$$

Here, the time averaged volatility $v \equiv v(\Delta t)$ is defined
through $v^2 \equiv \langle G^2 \rangle_T - \langle G \rangle_T^2$, and
$\langle \ldots \rangle_T$ denotes an average over the entire length of
the time series. Figure 4(a) shows the cumulative distribution of
returns for $\Delta t = 1$ min. For both positive and negative tails,
we find a power-law asymptotic behavior

$$P(g > x) \sim \frac{1}{x^{\alpha}} , \tag{4}$$

similar to what was found for individual stocks. For the region
$3 < g < 50$, regression fits yield

$$\alpha = \begin{cases} 3.05 \pm 0.04 & \text{(positive tail)} \\ 2.94 \pm 0.08 & \text{(negative tail)} \end{cases} , \tag{5}$$

well outside the Levy stable range, $0 < \alpha < 2$. Consistent
values for $\alpha$ are also obtained from the density function. For a
more accurate estimation of the asymptotic behavior, we use the
modified Hill estimator [Fig. 5(a,b)]. We obtain estimates for the
asymptotic slope in the region $3 < g < 50$:

$$\alpha = \begin{cases} 2.93 \pm 0.11 & \text{(positive tail)} \\ 3.02 \pm 0.15 & \text{(negative tail)} \end{cases} . \tag{6}$$
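The normalization of Eq. (3) and a regression fit of the kind behind Eq. (5) can be sketched as follows. This is a minimal illustration under stated assumptions, not the fitting code used for the paper: the data are synthetic symmetric power-law variables with exponent 3, the fit is an ordinary least-squares line through the empirical cumulative distribution on log-log axes, and the names `normalize` and `tail_exponent` are hypothetical.

```python
import math
import random

def normalize(returns):
    """Normalized returns g = (G - <G>) / v, with v^2 = <G^2> - <G>^2 (Eq. 3)."""
    n = len(returns)
    mean = sum(returns) / n
    v = math.sqrt(sum(G * G for G in returns) / n - mean * mean)
    return [(G - mean) / v for G in returns]

def tail_exponent(g, lo, hi):
    """Minus the least-squares slope of log P(g > x) versus log x for lo < x < hi."""
    n = len(g)
    pos = sorted((x for x in g if x > 0), reverse=True)
    # The k-th largest value (k = 1, 2, ...) has empirical P(g > x) of about k / n.
    pts = [(math.log(x), math.log(k / n))
           for k, x in enumerate(pos, start=1) if lo < x < hi]
    mx = sum(px for px, _ in pts) / len(pts)
    my = sum(py for _, py in pts) / len(pts)
    slope = (sum((px - mx) * (py - my) for px, py in pts)
             / sum((px - mx) ** 2 for px, _ in pts))
    return -slope

# Synthetic stand-in for the returns (an assumption): random sign times a
# Pareto variable with P(|G| > x) = x**-3 for x >= 1.
rng = random.Random(1)
sample = [rng.choice((-1.0, 1.0)) * rng.paretovariate(3.0)
          for _ in range(400_000)]

g = normalize(sample)
alpha_pos = tail_exponent(g, 3.0, 50.0)                # positive tail
alpha_neg = tail_exponent([-x for x in g], 3.0, 50.0)  # negative tail
```

Both estimates should come out near 3 for this surrogate. Fitting the two tails separately mirrors the presentation of Eqs. (5) and (6).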



For the region $g < 3$, regression fits yield smaller estimates of
$\alpha$, consistent with the possibility of a Levy distribution in
the central region. The values of $\alpha$ obtained in this range are
quite sensitive to the bounds of the region used for fitting. Our
estimates range from $\alpha \approx 1.35$ up to $\alpha \approx 1.8$
for different fitting regions in the interval $0.1 < g < 6$. For
example, in the region $0.5 < g < 3$, we obtain

$$\alpha = \begin{cases} 1.6 & \text{(positive tail)} \\ 1.7 & \text{(negative tail)} \end{cases} , \tag{7}$$

which are consistent with the result $\alpha \approx 1.4$ found for
small values of $g$ in Ref. [10]. Note that in Ref. [10] the estimates
of $\alpha$ were calculated using the scaling form of the return
probability to the origin $P(0)$. It is possible that for the
financial data analyzed here, $P(0)$ is not the optimal statistic,
because of the discreteness of the individual-company distributions
that comprise it [48]. It is also possible that our values of $\alpha$
for small values of $g$ could be due to the discreteness in the
returns of the individual companies comprising the S&P 500.



B. Scaling of the distribution of returns for $\Delta t$ up to 1 day

Next, we study the distribution of normalized returns for longer time
scales. Figure 6(a) shows the cumulative distribution of normalized
S&P 500 returns for time scales up to 512 min (approximately 1.5
days). The distribution appears to retain its power-law functional
form for these time scales. We verify this scaling behavior by
analyzing the moments of the distribution of normalized returns $g$,

$$\mu_k \equiv \langle |g|^k \rangle_T , \tag{8}$$

where $\langle \ldots \rangle_T$ denotes an average over all the
normalized returns for all the bins. Since $\alpha \approx 3$, we
expect $\mu_k$ to diverge for $k \ge 3$, and hence we compute $\mu_k$
for $k < 3$.

Figure 6(b) shows the moments of the normalized returns $g$ for
different time scales from 5 min up to 1 day. The moments do not vary
significantly for the above time scales, confirming the apparent
scaling behavior of the distribution observed in Fig. 6(a).
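The moment check of Eq. (8) can be sketched in a few lines. The example below is an illustration only: it uses i.i.d. Gaussian one-step returns as a stand-in (an assumption, chosen because the expected moments are known exactly), aggregates them into non-overlapping blocks, and evaluates $\mu_k$ for a few $k < 3$. The helper names are hypothetical.

```python
import math
import random

def normalize(returns):
    """Normalized returns g = (G - <G>) / v, as in Eq. (3)."""
    n = len(returns)
    mean = sum(returns) / n
    v = math.sqrt(sum(G * G for G in returns) / n - mean * mean)
    return [(G - mean) / v for G in returns]

def aggregate(returns, n):
    """Sum non-overlapping blocks of n consecutive returns (time scale n * dt)."""
    return [sum(returns[i:i + n]) for i in range(0, len(returns) - n + 1, n)]

def moment(g, k):
    """Fractional moment mu_k = <|g|^k> of Eq. (8)."""
    return sum(abs(x) ** k for x in g) / len(g)

# Stand-in data (an assumption): i.i.d. Gaussian one-step returns.
rng = random.Random(2)
one_step = [rng.gauss(0.0, 1.0) for _ in range(50_000)]

orders = (0.5, 1.0, 2.0)
mu = {n: [moment(normalize(aggregate(one_step, n)), k) for k in orders]
      for n in (1, 4, 16)}

# For a Gaussian, mu_1 = sqrt(2 / pi) at every time scale.
gaussian_mu1 = math.sqrt(2.0 / math.pi)
```

For Gaussian data the moments are the same at every aggregation level by construction. The observation in the text is that the empirical moments also stay roughly constant, although the underlying distribution is far from Gaussian.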



V. THE DISTRIBUTION OF RETURNS FOR $\Delta t > 1$ DAY

A. The S&P 500 index

For time scales beyond 1 day, we use database (ii), which contains
daily-sampled records of the S&P 500 index for the 35-year period
1962-1996. Figure 7(a) shows the agreement between distributions of
normalized S&P 500 daily returns from database (i), which contains
1 min sampled data, and database (ii), which contains daily-sampled
data. Regression fits for the region $1 < g < 10$ give estimates of
$\alpha \approx 3$. Figure 7(b) shows the scaling behavior of the
distribution for $\Delta t = 1$, 2, and 4 days. For these choices of
$\Delta t$, the scaling behavior is also visible for the moments
[Fig. 7(c)].

Figure 8(a) shows the distribution of the S&P 500 returns for
$\Delta t = 4$, 8, and 16 days. The data are now consistent with a
slow convergence to Gaussian behavior. This is also visible for the
moments [Fig. 8(b)].

B. The NIKKEI and Hang-Seng indices

The S&P 500 is but one of the many stock market indices. Hence, we
investigate if the above results regarding the power-law asymptotic
behavior of the distribution of returns hold for other market indices
as well. Figure 9 compares the distributions of daily returns for the
NIKKEI index of the Tokyo stock exchange and the Hang-Seng index of
the Hong Kong stock exchange with that of the S&P 500. The
distributions have similar functional forms, suggesting the
possibility of "universal" behavior of these distributions. In
addition, the estimates of $\alpha$ from regression fits,

$$\alpha = \begin{cases} 3.05 \pm 0.16 & \text{(NIKKEI)} \\ 3.03 \pm 0.16 & \text{(Hang-Seng)} \end{cases} , \tag{9}$$

are in good agreement for the three cases.



VI. DEPENDENCE OF AVERAGE VOLATILITY ON TIME SCALE

The behavior of the time-averaged volatility $v(\Delta t)$ as a
function of the time scale $\Delta t$ is shown in Fig. 3(c). We find a
power-law dependence,

$$v(\Delta t) \propto (\Delta t)^{\delta} . \tag{10}$$

We estimate $\delta \approx 0.7$ for time scales $\Delta t < 20$ min.
This value is larger than 1/2 due to the exponentially-damped time
correlations, which are significant up to approximately 20 min. Beyond
20 min, $\delta \approx 0.5$, indicating the absence of correlations
in the returns, in agreement with Fig. 3(a). The time-averaged
volatility is also consistent with essentially uncorrelated behavior
for the daily and monthly returns.



VII. VOLATILITY CORRELATIONS AND TIME SCALING

We have presented evidence that the distributions of returns retain
the same functional form for a range of time scales [see Fig. 10 and
Table I]. Here, we investigate possible causes of this scaling
behavior. Previous explanations of scaling relied on Levy stable and
exponentially-truncated Levy processes. However, the empirical data
that we analyze are not consistent with either of these two processes.

A. Rate of convergence

Here, we compare the rate of convergence of the probability of the
returns to that of a computer-generated time series which has the same
distribution but is statistically independent by construction. This
way, we will be able to study the convergence to Gaussian behavior of
independent random variables distributed as a power law, with an
exponent $\alpha \approx 3$.

First, we generate a time series $X \equiv x_k$,
$k = 1, \ldots, 40 \times 10^6$, distributed as $P(X > x) \sim 1/x^3$.
We next calculate the new random variables
$I_n \equiv \sum_{k=1}^{n} x_k$, and compute the cumulative
distributions of $I_n$ for increasing values of $n$. These
distributions show faster convergence with increasing $n$ than the
distributions of returns [Fig. 11(a)]. This convergence is also
visible in the moments. Figures 11(a,b) show that for $n = 256$, both
the moments and the cumulative distribution show Gaussian behavior. In
contrast, for the distribution of returns, we observe significantly
slower convergence to Gaussian behavior: in the case of the S&P 500
index, one observes a possible onset of convergence for
$\Delta t \approx 4$ days (1560 mins), starting from 1 min returns.

These results confirm the existence of time dependencies in the
returns [27,36,43]. Next, we show that the scaling behavior observed
for the S&P 500 index no longer holds when we destroy the dependencies
between the returns at different times.
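The rate-of-convergence test with independent power-law variables can be sketched on a much smaller scale than the $40 \times 10^6$ points quoted in the text. The code below is an illustration under stated assumptions: the variables are taken to be symmetric with $P(|X| > x) = x^{-3}$ for $x \ge 1$ (the text does not specify the sign convention), only 512,000 points are drawn, and convergence is monitored through the first absolute moment of the normalized sums rather than the full cumulative distribution.

```python
import math
import random

def mu1(values):
    """First absolute moment of the values after normalizing to unit variance."""
    n = len(values)
    mean = sum(values) / n
    v = math.sqrt(sum(x * x for x in values) / n - mean * mean)
    return sum(abs(x - mean) for x in values) / (n * v)

def block_sums(values, n):
    """Sums I_n of n consecutive, non-overlapping independent variables."""
    return [sum(values[i:i + n]) for i in range(0, len(values) - n + 1, n)]

rng = random.Random(3)
# Assumption: symmetric variables with P(|X| > x) = x**-3 for x >= 1.
x = [rng.choice((-1.0, 1.0)) * rng.paretovariate(3.0)
     for _ in range(512_000)]

# Gaussian reference value of the normalized first absolute moment.
gaussian = math.sqrt(2.0 / math.pi)

# The moment drifts from its single-variable value toward the Gaussian one.
mu1_by_n = {n: mu1(block_sums(x, n)) for n in (1, 16, 256)}
```

For independent variables the moment at $n = 256$ should sit close to the Gaussian value, which is the behavior contrasted in the text with the much slower convergence of the actual returns.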



B. Randomizing the time series of returns

We start with the 1 min returns and then destroy all the time
dependencies that might be present by shuffling the time series of
$G_{\Delta t = 1}(t)$, thereby creating a new time series
$G^{sh}_{1}(t)$ which contains statistically independent returns. By
adding up $n$ consecutive returns of the shuffled series
$G^{sh}_{1}(t)$, we construct the $n$ min returns $G^{sh}_{n}(t)$.

Figure 12(a) shows the cumulative distribution of $G^{sh}_{n}(t)$ for
increasing values of $n$. We find a progressive convergence to
Gaussian behavior with increasing $n$. This convergence to Gaussian
behavior is also clear in the moments of $G^{sh}_{n}(t)$, which
rapidly approach the Gaussian values with increasing $n$
[Fig. 12(b)]. This rapid convergence confirms that the time
dependencies cause the observed scaling behavior.
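The shuffling test can be illustrated with a toy series in which the time dependence is known. The sketch below assumes returns of the form $\epsilon(t)\,v(t)$ with Gaussian $\epsilon$ and a volatility that is redrawn from two values every 512 steps; this regime-switching model is an assumption made for the example and is not a model of the S&P 500 data. Aggregated returns are compared before and after shuffling through their normalized first absolute moment.

```python
import math
import random

def mu1(values):
    """First absolute moment of the values after normalizing to unit variance."""
    n = len(values)
    mean = sum(values) / n
    v = math.sqrt(sum(x * x for x in values) / n - mean * mean)
    return sum(abs(x - mean) for x in values) / (n * v)

def block_sums(values, n):
    """Add up n consecutive returns to build returns on the n-step time scale."""
    return [sum(values[i:i + n]) for i in range(0, len(values) - n + 1, n)]

rng = random.Random(4)
# Toy returns eps(t) * v(t); v(t) is redrawn from {0.5, 3.0} every 512 steps.
returns = []
for _ in range(400):
    v = rng.choice((0.5, 3.0))
    returns.extend(rng.gauss(0.0, 1.0) * v for _ in range(512))

# Shuffling keeps the one-step distribution but destroys all time dependence.
shuffled = returns[:]
rng.shuffle(shuffled)

gaussian = math.sqrt(2.0 / math.pi)          # about 0.798
mu1_original = mu1(block_sums(returns, 64))   # stays well below the Gaussian value
mu1_shuffled = mu1(block_sums(shuffled, 64))  # close to the Gaussian value
```

Because both series share exactly the same one-step distribution, any difference between the two aggregated moments is due to the time ordering alone, which is the logic of the test described above.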



VIII. DISCUSSION 

We have presented a detailed analysis of the distribution of returns
for market indices, for time intervals $\Delta t$ ranging over roughly
4 orders of magnitude, from 1 min up to 1 month (approximately
16,000 min). We find that the distribution of returns is consistent
with a power-law asymptotic behavior, characterized by an exponent
$\alpha \approx 3$, well outside the stable Levy regime
$0 < \alpha < 2$. For time scales $\Delta t \gg (\Delta t)_\times$,
where $(\Delta t)_\times \approx 4$ days, our results are consistent
with slow convergence to Gaussian behavior.

We have also demonstrated that the scaling behavior does not hold if
we destroy all the time dependencies by shuffling. The breakdown of
the scaling behavior of the distribution of returns upon shuffling the
time series suggests that the long-range volatility correlations,
which persist up to several months [27,36,43], may be one possible
reason for the observed scaling behavior.

Recent studies show that the distribution of volatility is consistent
with an asymptotic power-law behavior with exponent 3, just as
observed for the distribution of returns. This finding suggests that
the process of returns may be written as

$$g(t) = \epsilon(t) \, v(t) , \tag{11}$$

where $g(t)$ denotes the return at time $t$, $v(t)$ denotes the
volatility, and $\epsilon(t)$ is an i.i.d. random variable independent
of $v(t)$. Since the asymptotic behavior of the distributions of
$v(t)$ and $g(t)$ is consistent with power-law behavior, $\epsilon(t)$
should have an asymptotic behavior with faster decay than either
$g(t)$ or $v(t)$. In fact, Eq. (11) is central to all the ARCH models,
with $\epsilon(t)$ assumed to be Gaussian distributed.

Different ARCH processes assume different recursion relations for
$v(t)$. In the standard ARCH model, $v(t) = a + \beta g^2(t - 1)$,
leading to a power-law distribution of returns with exponent depending
on the parameters $a$ and $\beta$. However, the standard ARCH process
predicts a volatility correlation that decays exponentially, since
$v(t)$ depends only on the previous event, and cannot account for the
observed long-range persistence in $v(t)$. To try to remedy this, one
can require $v(t)$ to depend not only on the previous value of $g(t)$
but on a finite number of past events. This generalization is called
the GARCH model. Dependence of $v(t)$ on the finite past leads not to
a power-law decay (as is observed empirically), but to volatility
correlations that decay exponentially, with larger decay times as the
number of events "remembered" is increased.



In order to explain the long-range persistence of the autocorrelation
function of the volatility, one must assume that $v(t)$ depends on all
the past rather than a finite number of past events. Such a
description would be consistent with the empirical finding of
long-range correlations in the volatility, and the observation that
the distributions of $g(t)$ and $v(t)$ have similar asymptotic forms.
If the process of returns were governed by the volatility, as in
Eq. (11), then the volatility would seem to be the more fundamental
process. In fact, it is possible that the volatility is a measure of
the amount of information arriving into the market, and that the
statistical properties of the returns may be "driven" by this
information.



APPENDIX A: LEVY STABLE DISTRIBUTIONS

Levy stable distributions arise from the generalization of the central
limit theorem to a wider class of distributions. Consider the partial
sum $P_n \equiv \sum_{i=1}^{n} x_i$ of independent identically
distributed (i.i.d.) random variables $x_i$. If the $x_i$'s have
finite second moment, the central limit theorem holds and $P_n$ is
distributed as a Gaussian in the limit $n \to \infty$.

If the random variables $x_i$ are characterized by a distribution
having asymptotic power-law behavior

$$P(x) \sim x^{-(1+\alpha)} , \tag{A1}$$

where $\alpha < 2$, then $P_n$ will converge to a Levy stable
stochastic process of index $\alpha$ in the limit $n \to \infty$.

Except for special cases, such as the Cauchy distribution, Levy stable
distributions cannot be expressed in closed form. They are often
expressed in terms of their Fourier transforms or characteristic
functions, which we denote $\varphi(q)$, where $q$ denotes the Fourier
transformed variable. The general form of a characteristic function of
a Levy stable distribution is

$$\ln \varphi(q) = \begin{cases} i \mu q - \gamma |q|^{\alpha} \left[ 1 + i \beta \frac{q}{|q|} \tan\left( \frac{\pi}{2} \alpha \right) \right] & [\alpha \neq 1] \\ i \mu q - \gamma |q| \left[ 1 + i \beta \frac{q}{|q|} \frac{2}{\pi} \ln |q| \right] & [\alpha = 1] \end{cases} , \tag{A2}$$
where $0 < \alpha < 2$, $\gamma$ is a positive number, $\mu$ is the
mean, and $\beta$ is an asymmetry parameter. For symmetric Levy
distributions ($\beta = 0$), one has the functional form

$$P(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \exp(-\gamma |q|^{\alpha}) \, e^{-iqx} \, dq . \tag{A3}$$

For $\alpha = 1$, one obtains the Cauchy distribution and for the
limiting case $\alpha = 2$, one obtains the Gaussian distribution.
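The two closed-form cases can be checked numerically. The sketch below evaluates the symmetric form of Eq. (A3) with a plain trapezoid rule, using the fact that the integrand is even in $q$ so that the integral reduces to a cosine transform over $q \ge 0$. The function name, the cutoff `q_max`, and the step count are illustrative choices, not values from the paper.

```python
import math

def levy_symmetric_pdf(x, alpha, gamma=1.0, q_max=50.0, steps=50_000):
    """Trapezoid-rule evaluation of Eq. (A3) for a symmetric Levy distribution:
    P(x) = (1 / pi) * integral_0^q_max exp(-gamma * q**alpha) * cos(q * x) dq.
    """
    h = q_max / steps
    # End points carry weight 1/2; the integrand equals 1 at q = 0.
    total = 0.5 * (1.0 + math.exp(-gamma * q_max ** alpha) * math.cos(q_max * x))
    for i in range(1, steps):
        q = i * h
        total += math.exp(-gamma * q ** alpha) * math.cos(q * x)
    return total * h / math.pi

# alpha = 1, gamma = 1: Cauchy distribution 1 / (pi * (1 + x^2)).
cauchy_numeric = levy_symmetric_pdf(1.0, alpha=1.0)
cauchy_exact = 1.0 / (math.pi * (1.0 + 1.0 ** 2))

# alpha = 2, gamma = 1: Gaussian with variance 2, exp(-x^2 / 4) / sqrt(4 pi).
gauss_numeric = levy_symmetric_pdf(1.0, alpha=2.0)
gauss_exact = math.exp(-0.25) / math.sqrt(4.0 * math.pi)
```

The same routine can be called with other values of $\alpha$ between 0 and 2, although for small $\alpha$ the integrand decays slowly and the cutoff would need to be raised.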

By construction, Levy distributions are stable, that is, the sum of
two independent random variables $x_1$ and $x_2$, characterized by the
same Levy distribution of index $\alpha$, is itself characterized by a
Levy distribution of the same index. The functional form of the
distribution is maintained if we sum up independent, identically
distributed Levy stable random variables.

For Levy distributions, the asymptotic behavior of $P(x)$ for
$x \gg 1$ is a power law,

$$P(x) \sim x^{-(1+\alpha)} . \tag{A4}$$

Hence, the second moment diverges. Specifically,
$\langle |x|^n \rangle$ diverges for $n \ge \alpha$ when
$\alpha < 2$. In particular, all Levy stable processes with
$\alpha < 2$ have infinite variance. Thus, non-Gaussian stable
stochastic processes do not have a characteristic scale. Although
well-defined
mathematically, these distributions are difficult to use 
and raise fundamental problems when applied to real sys- 
tems where the second moment is often related to the 
properties of the system. In finance, an infinite variance 
would make risk estimation and derivative pricing impos- 
sible. 



APPENDIX B: THE HILL ESTIMATOR ("LOCAL 
SLOPES") 

A common problem when studying a distribution that decays as a power
law is how to obtain an accurate estimate of the exponent
characterizing the asymptotic behavior. Here, we review the methods of
Hill. The basic idea is to calculate the inverse of the local
logarithmic slope $\zeta$ of the cumulative distribution $P(g > x)$,

$$\zeta \equiv - \left( \frac{d \log P}{d \log x} \right)^{-1} . \tag{B1}$$

We then estimate the inverse asymptotic slope $1/\alpha$ by
extrapolating $\zeta$ as $1/x \to 0$. We start with the normalized
returns $g$ and proceed in the following steps:

Step I: We sort the normalized returns $g$ in descending order. The
sorted returns are denoted $g_k$, $k = 1, \ldots, N$, where
$g_k > g_{k+1}$ and $N$ is the total number of events.

Step II: The cumulative distribution is then expressed in terms of the
sorted returns as

$$P(g_k) = \frac{k}{N} . \tag{B2}$$

Figure 13 is a schematic of the cumulative distribution thus obtained.
The inverse local slopes $\zeta(g)$ can be written as

$$\zeta(g_k) = - \frac{\log(g_{k+1} / g_k)}{\log(P(g_{k+1}) / P(g_k))} . \tag{B3}$$

Using Eq. (B2), the above expression can be well approximated for
large $k$ as

$$\zeta(g_k) \simeq -k \left[ \log(g_{k+1}) - \log(g_k) \right] , \tag{B4}$$

yielding estimates of the local inverse slopes.

Step III: We obtain the inverse local slopes through Eq. (B4). We can
then compute an average of the inverse slopes over $m$ points,

$$\langle \zeta \rangle \equiv \frac{1}{m} \sum_{k=1}^{m} \zeta(g_k) , \tag{B5}$$

where the choice of the averaging window length $m$ varies depending
on the number of events $N$ available.

Step IV: We plot the locally averaged inverse slopes
$\langle \zeta \rangle$ obtained in Step III as a function of the
inverse normalized returns $1/g$ [see, e.g., Fig. 5]. We can then
define two methods of estimating $\alpha$. In the first method, we
extrapolate $\zeta$ as a function of $1/g$ to 0, similarly to the
method of successive slopes; this procedure yields the inverse
asymptotic slope $1/\alpha$. In the second method, we average over all
events for $1/g$ smaller than a given threshold, with the average
yielding the inverse slope $1/\alpha$.

To test the Hill estimator, we analyze two surrogate data sets with
known asymptotic behavior: (a) an independent random variable with
$P(g > x) = (1 + x)^{-3}$, and (b) an independent random variable with
$P(g > x) = \exp(-x)$. As shown in Figs. 13(b,c), the method yields
the correct results $\alpha = 3$ and $\alpha = \infty$, respectively.
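Steps I to IV can be condensed into a short sketch. The code below follows the second method of Step IV (a plain average of the inverse local slopes over all events above a threshold) and is an illustration rather than the procedure used for the figures. Two assumptions are made for the example: the power-law surrogate is a pure Pareto variable with $P(g > x) = x^{-3}$ for $x \ge 1$ instead of the $(1 + x)^{-3}$ form quoted above, and the function names and thresholds are hypothetical.

```python
import math
import random

def hill_inverse_slopes(g):
    """Steps I-II: sort in descending order and return pairs (g_k, zeta_k) with
    zeta_k = k * (log g_k - log g_{k+1}), the large-k form of Eq. (B4)."""
    s = sorted((x for x in g if x > 0), reverse=True)
    return [(s[k - 1], k * (math.log(s[k - 1]) - math.log(s[k])))
            for k in range(1, len(s))]

def hill_alpha(g, threshold):
    """Steps III-IV (second method): average zeta over all events with
    g > threshold and return the inverse of that average."""
    zetas = [z for x, z in hill_inverse_slopes(g) if x > threshold]
    return len(zetas) / sum(zetas)

rng = random.Random(5)

# Surrogate (a), simplified (an assumption): P(g > x) = x**-3 for x >= 1.
power_law = [rng.paretovariate(3.0) for _ in range(200_000)]
alpha_power_law = hill_alpha(power_law, threshold=3.0)

# Surrogate (b): P(g > x) = exp(-x). There is no finite exponent, so the
# estimate keeps growing as the threshold is raised.
exponential = [rng.expovariate(1.0) for _ in range(200_000)]
alpha_exp_low = hill_alpha(exponential, threshold=3.0)
alpha_exp_high = hill_alpha(exponential, threshold=6.0)
```

For the power-law surrogate the estimate should be close to 3 and insensitive to the threshold, whereas for the exponential surrogate the drift with threshold is the numerical signature of $\alpha = \infty$.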



TABLE I. The values of the exponent $\alpha$, for different time
scales $\Delta t$, for the S&P 500 index: (a) power-law regression fit
to the cumulative distribution, and (b) Hill estimator. The daggered
values are computed using database (ii), which contains daily-sampled
records, while the values without the dagger are computed using
database (i), which contains records with a 1 min sampling. Note that
we use the conversion 1 day = 390 min.

FIG. 1. The S&P 500 index is the sum of the market capitalizations
of 500 companies. In (a), we display both the value
of the S&P 500 index (bottom line) and the index detrended 
by inflation to 1994 US dollars (top line). The sharp jump 
seen in 1987 is the market crash of October 19. (b) Compari- 
son of the time evolution of the S&P 500 for the 35-year period 
1962-96 (top line) and a biased Gaussian random walk (bot- 
tom line). The random walk has the same bias as the S&P 
500 — approximately 7% per year for the period considered. 

FIG. 2. Sequence of (a) 10 min returns, from database (i), and (b)
1 month returns, from database (iii), for the S&P 500, normalized to
unit variance. (c) Sequence of i.i.d. Gaussian random variables with
unit variance, which was proposed by Bachelier as a model for stock
returns. For all 3 panels, there are 850 events, i.e., in panel (a)
850 minutes and in panel (b) 850 months. Note that, in contrast to (a)
and (b), there are no "extreme" events in (c).
FIG. 3. (a) Semilog plot of the autocorrelation function
for the S&P 500 returns $G_{\Delta t}(t)$ sampled at a $\Delta t = 1$ min
time scale,
$$C_{\Delta t}(\tau) = \frac{\langle G_{\Delta t}(t)\, G_{\Delta t}(t+\tau) \rangle - \langle G_{\Delta t}(t) \rangle^2}{\langle G_{\Delta t}(t)^2 \rangle - \langle G_{\Delta t}(t) \rangle^2}.$$
The straight line corresponds to an exponential decay with a
characteristic decay time $\tau_{ch} = 4$ min. Note that after 20 min
the correlations are at the noise level. (b) Log-log plot of the
autocorrelation function of the absolute returns. The solid
line is a power-law regression fit over the entire range, which
gives an estimate of the power-law exponent, $\eta = 0.29 \pm 0.05$.
Better estimates of this exponent can be obtained from the
power spectrum or from other more sophisticated methods. It
has been recently reported using such methods that the
autocorrelation function of the absolute value of the returns shows
two power-law regimes with a crossover at approximately
1.5 days p5| . (c) Log-log plot of the time-averaged volatility
$v = v(\Delta t)$ as a function of the time scale $\Delta t$ of the returns
obtained from databases (i-iii). For $\Delta t < 20$ min, we observe
a slope $\delta = 0.67 \pm 0.03$, due to the exponentially damped time
correlations. For $\Delta t > 20$ min, we observe $\delta = 0.51 \pm 0.06$,
indicating the absence of significant correlations.
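
The normalized autocorrelation defined in the caption of Fig. 3(a) can be sketched in a few lines of Python. This is an illustration only: the function names `autocorrelation` and `ar1_series`, the synthetic AR(1) series, and all parameter values are assumptions made for the example, not the authors' code or data. The AR(1) coefficient is chosen so that correlations decay exponentially with a characteristic time of 4 steps, mimicking the 4 min decay time quoted in the caption.

```python
import math
import random


def autocorrelation(g, lag):
    """Normalized autocorrelation of the sequence g at the given lag.

    C(lag) = (<g(t) g(t+lag)> - <g>^2) / (<g^2> - <g>^2)
    """
    n = len(g)
    mean = sum(g) / n
    variance = sum(x * x for x in g) / n - mean ** 2
    pairs = n - lag
    cross = sum(g[t] * g[t + lag] for t in range(pairs)) / pairs
    return (cross - mean ** 2) / variance


def ar1_series(n, decay_time, seed=0):
    """Synthetic stand-in for returns (assumption): AR(1) noise whose
    autocorrelation decays as exp(-lag / decay_time)."""
    rng = random.Random(seed)
    phi = math.exp(-1.0 / decay_time)
    g = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        g.append(phi * g[-1] + rng.gauss(0.0, 1.0))
    return g


series = ar1_series(50_000, decay_time=4.0)
for lag in (0, 1, 4, 8, 20):
    print(lag, round(autocorrelation(series, lag), 3))
```

On a semilog plot the printed values would fall close to a straight line until they reach the noise level, which for a series of this length is of order 0.01.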
FIG. 4. (a) Log-log plot of the cumulative distribution
of the normalized 1 min returns for the S&P 500 index.
Power-law regression fits in the region $3 < g < 50$ yield
$\alpha = 2.95 \pm 0.07$ (positive tail), and $\alpha = 2.75 \pm 0.13$
(negative tail). For the region $0.5 < g < 3$, regression fits give
$\alpha = 1.6 \pm 0.1$ (positive tail), and $\alpha = 1.7 \pm 0.1$ (negative
tail). (b) Log-log and (c) linear-log plots of the probability
density function for the normalized S&P 500 returns. The
solid lines are power-law fits with exponents $1 + \alpha \approx 4$.
Power-law regression fits in the region $3 < g < 50$ yield
estimates $\alpha = 3.01 \pm 0.11$ (positive tail), and $\alpha = 3.02 \pm 0.08$
(negative tail).
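
A power-law regression fit of the kind quoted in Fig. 4(a) can be sketched as a least-squares line through the empirical cumulative distribution on log-log axes. The sketch below is a minimal illustration under stated assumptions: the function name `tail_exponent_by_regression`, the synthetic Pareto sample with $\alpha = 3$, and the fit window are all hypothetical choices, and the authors' actual fitting procedure may differ in detail.

```python
import math
import random


def tail_exponent_by_regression(values, lower, upper):
    """Estimate alpha from a least-squares line through
    log P(g >= x) versus log x, using points with lower < x < upper."""
    ordered = sorted(values, reverse=True)
    n = len(ordered)
    points = []
    for rank, x in enumerate(ordered, start=1):
        if lower < x < upper:
            # rank / n is the empirical probability of a value >= x
            points.append((math.log(x), math.log(rank / n)))
    mean_u = sum(u for u, _ in points) / len(points)
    mean_v = sum(v for _, v in points) / len(points)
    sxy = sum((u - mean_u) * (v - mean_v) for u, v in points)
    sxx = sum((u - mean_u) ** 2 for u, _ in points)
    return -sxy / sxx  # the slope of the line is -alpha


rng = random.Random(1)
# Assumed test data: pure power law with P(g > x) = x^-3 for x >= 1
sample = [rng.paretovariate(3.0) for _ in range(50_000)]
alpha_hat = tail_exponent_by_regression(sample, lower=2.0, upper=10.0)
print(round(alpha_hat, 2))
```

Because every data point enters the regression once, the fit is weighted toward the lower end of the window, where the empirical distribution is best sampled.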
FIG. 5. Inverse local slopes of the cumulative distributions
of normalized returns for $\Delta t = 1$ min for the (a) positive and
(b) negative tails. Each point is an average over 100 different
inverse local slopes. Extrapolation of the regression lines
provides estimates for the asymptotic slopes $\alpha = 3.45 \pm 0.07$
(positive tail), and $\alpha = 3.29 \pm 0.07$ (negative tail).
FIG. 6. (a) Log-log plot of the cumulative distribution
of normalized returns of the positive tails for
$\Delta t = 16, 32, 128, 512$ min. Power-law regression fits yield
estimates of the asymptotic power-law exponent $\alpha = 2.69 \pm 0.04$,
$\alpha = 2.53 \pm 0.06$, $\alpha = 2.83 \pm 0.18$ and $\alpha = 3.39 \pm 0.03$ for
$\Delta t = 16, 32, 128$ and 512 min, respectively. (b) The
moments of the distribution for $\Delta t = 1, 32, 128$ and 512 min.
The change in the behavior of the moments from the 1 min
scale is probably the effect of the gradual disappearance of
the Lévy slope for small values of $g$. For $\Delta t > 30$ min there is
no region with slopes in the Lévy range, and we observe good
agreement between all time scales.



FIG. 7. (a) Cumulative distribution of the normalized S&P
500 returns from two different databases: database (i), which
contains 1 min records for 13 years, and database (ii), which
contains daily records for 35 years. Power-law regression fits
in the region $g > 1$ lead to the estimates $\alpha = 3.75 \pm 0.30$ for
database (i), and $\alpha = 3.66 \pm 0.11$ for database (ii). (b) The
cumulative distribution from database (ii) for $\Delta t = 1, 2$ and
4 days. The apparent scaling behavior of these distributions
is confirmed by the estimates $\alpha = 3.75 \pm 0.41$ ($\Delta t = 2$ days)
and $\alpha = 3.77 \pm 0.29$ ($\Delta t = 4$ days). (c) The behavior of
the moments for these time scales is in agreement with the
apparent scaling behavior.
FIG. 8. (a) Cumulative distribution for the positive tail of
S&P 500 returns for time scales $\Delta t = 4, 8$ and 16 days. The
bold curve shows the cumulative distribution of a Gaussian
with zero mean and unit variance. (b) The moments for time
scales $\Delta t = 8$ and 16 days are consistent with a slow
convergence to Gaussian behavior. Note that the curves for $\Delta t = 1$
and 4 days are indistinguishable.
FIG. 9. Comparison of the cumulative distributions for
the positive tails of the normalized returns for the daily
records of the NIKKEI index from 1984-97, the daily records
of the Hang-Seng index from 1980-97 and the daily records
of the S&P 500 index. The apparent power-law behavior in
the tails is characterized by the exponents $\alpha = 3.05 \pm 0.16$
(NIKKEI), $\alpha = 3.03 \pm 0.16$ (Hang-Seng) and $\alpha = 3.34 \pm 0.12$
(S&P 500). The fits are performed in the region $g > 1$.
FIG. 10. The values of the exponent $\alpha$ characterizing
the asymptotic power-law behavior of the distribution of
returns as a function of the time scale $\Delta t$ obtained using (a)
a power-law fit, and (b) the Hill estimator. The values of $\alpha$
for $\Delta t < 1$ day are calculated from database (i), which
contains 13 years of 1 min records, while for $\Delta t > 1$ day they
are calculated from database (ii), which has 35 years of daily
records. The unshaded region, corresponding to time scales
larger than $(\Delta t)_\times \approx 4$ days (1560 min), indicates the range of
time scales where we find results consistent with slow
convergence to Gaussian behavior (see the text and the preceding
figures).
FIG. 12. We randomize the time series of returns for
the S&P 500 for $\Delta t = 1$ min and create a time series with the
same distribution but with independent random variables.
We then sum up $n$ consecutive shuffled returns to create a
shuffled $n$ min return. (a) Cumulative distributions of the
positive tails of the shuffled returns are shown for increasing
$n$. We find slow convergence to Gaussian behavior on
increasing $n$. (b) The slow convergence to a Gaussian behavior
is shown by the moments. The results in (b) can be
compared with Fig. pd|(b) if we note that $n = 512$ corresponds
to $\Delta t \approx 1.5$ days. The data are normalized to have the same
second moment.
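
The shuffling procedure described in the caption of Fig. 12 can be sketched as follows. The helper names `shuffled_aggregate` and `normalize`, the seed values, and the synthetic heavy-tailed series standing in for the 1 min returns are all assumptions made for this illustration; the real analysis uses the S&P 500 data from database (i).

```python
import random


def shuffled_aggregate(returns, n, seed=0):
    """Shuffle the returns to destroy their time ordering, then sum
    n consecutive shuffled values to form each aggregated return."""
    shuffled = list(returns)
    random.Random(seed).shuffle(shuffled)
    blocks = len(shuffled) // n
    return [sum(shuffled[i * n:(i + 1) * n]) for i in range(blocks)]


def normalize(values):
    """Shift to zero mean and rescale to unit second moment."""
    m = sum(values) / len(values)
    s = (sum((v - m) ** 2 for v in values) / len(values)) ** 0.5
    return [(v - m) / s for v in values]


rng = random.Random(2)
# Hypothetical stand-in for 1 min returns: symmetric, P(|g| > x) = x^-3
one_min = [rng.choice((-1, 1)) * rng.paretovariate(3.0) for _ in range(65_536)]
for n in (1, 16, 256):
    g = normalize(shuffled_aggregate(one_min, n))
    print(n, len(g), round(max(g), 2))
```

Shuffling keeps the one-step distribution intact while removing any dependence between successive values, so differences between the shuffled and unshuffled aggregates isolate the effect of time correlations.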



FIG. 11. Convergence of distribution for independent
variables. We first generate a time series $X_k$ distributed
as $P(X > x) \sim 1/x^3$. We then generate the variables
$I_n = \sum_{k=1}^{n} X_k$ for $n = 1, 16$ and 256. (a) Cumulative
distributions of $I_n$. Note that the curve for $n = 256$ is
indistinguishable from the Gaussian curve, revealing convergence to
Gaussian behavior. (b) The moments for $n = 1, 16$ and 256.
These results can be compared with Fig. |^. Note that for the
S&P 500 even for time scales $\Delta t = 16$ days (corresponding to
$n = 208$) we still do not observe a good degree of convergence.
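
The construction in Fig. 11 can be reproduced approximately with a short simulation. In the sketch below, the function names `normalized_sums` and `moment`, the symmetric Pareto generator, the sample count, and the seeds are assumptions chosen for illustration. The first absolute moment of the normalized sums is compared with the value $\sqrt{2/\pi}$ expected for a unit Gaussian.

```python
import math
import random


def normalized_sums(n, count, alpha=3.0, seed=0):
    """Draw `count` sums I_n of n i.i.d. symmetric variables with
    P(|X| > x) = x^-alpha, then rescale to zero mean and unit variance."""
    rng = random.Random(seed)
    sums = [
        sum(rng.choice((-1.0, 1.0)) * rng.paretovariate(alpha) for _ in range(n))
        for _ in range(count)
    ]
    mean = sum(sums) / count
    std = math.sqrt(sum((s - mean) ** 2 for s in sums) / count)
    return [(s - mean) / std for s in sums]


def moment(values, k):
    """Absolute moment <|g|^k>; k may be fractional."""
    return sum(abs(v) ** k for v in values) / len(values)


gaussian_first_moment = math.sqrt(2.0 / math.pi)  # <|g|> for a unit Gaussian
for n in (1, 16, 256):
    g = normalized_sums(n, count=2_000, seed=n)
    print(n, round(moment(g, 1.0), 3), round(gaussian_first_moment, 3))
```

Low-order moments such as this one are dominated by the center of the distribution; the higher moments shown in the figure are more sensitive to the tails and converge more slowly.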
FIG. 13. (a) Schematic representation of the evaluation
of the local slope from the cumulative distribution. First,
the normalized returns $g$ are sorted in descending order,
$g_k > g_{k+1}$. The dotted line indicates the local slope. (b)
Hill estimator for a sequence of i.i.d. random variables with
asymptotic behavior $P(g > x) = (1 + x)^{-3}$. (c) Hill estimator
for a sequence of i.i.d. random variables with asymptotic
behavior $P(g > x) = \exp(-x)$. Note that the asymptotic
estimates, $1/\alpha = 0.33$ and $1/\alpha = 0$, recover for both cases the
correct values of $\alpha$, $\alpha = 3$ and $\alpha = \infty$, respectively.
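
For reference, the textbook form of the Hill estimator can be sketched as below. This is the standard estimator of $1/\alpha$ from the $k$ largest order statistics; the averaging over inverse local slopes described in the captions may differ from it in detail. The function name `hill_inverse_exponent`, the choice of $k$, and the synthetic sample are assumptions. For simplicity the test data follow a pure power law $P(g > x) = x^{-3}$ for $x \ge 1$ rather than the $(1 + x)^{-3}$ form used in panel (b).

```python
import math
import random


def hill_inverse_exponent(values, k):
    """Hill estimate of 1/alpha from the k largest values.

    With g_(1) >= g_(2) >= ... >= g_(N), the estimate is
    (1/k) * sum_{i=1..k} [ln g_(i) - ln g_(k+1)].
    """
    ordered = sorted(values, reverse=True)
    threshold = math.log(ordered[k])  # ln g_(k+1), at zero-based index k
    return sum(math.log(ordered[i]) - threshold for i in range(k)) / k


rng = random.Random(3)
# Assumed test data: pure power law with alpha = 3
sample = [rng.paretovariate(3.0) for _ in range(20_000)]
for k in (100, 500, 2_000):
    print(k, round(hill_inverse_exponent(sample, k), 3))
```

The printed values should scatter around $1/\alpha = 1/3$, with a statistical error of roughly $(1/\alpha)/\sqrt{k}$.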